probability value
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)
QGuard:Question-based Zero-shot Guard for Multi-modal LLM Safety
Lee, Taegyeong, Yoo, Jeonghwa, Cho, Hyoungseo, Kim, Soo Yong, Maeng, Yunho
Recent advances in Large Language Models (LLMs) have had a significant impact on a wide range of fields, from general domains to specialized areas. However, these advances have also significantly increased the potential for malicious users to mount attacks using harmful and jailbreak prompts. Although there have been many efforts to prevent harmful and jailbreak prompts, protecting LLMs from such attacks remains an important and challenging task. In this paper, we propose QGuard, a simple yet effective safety guard method that uses question prompting to block harmful prompts in a zero-shot manner. Our method can defend LLMs not only from text-based harmful prompts but also from multi-modal harmful prompt attacks. Moreover, by diversifying and modifying guard questions, our approach remains robust against the latest harmful prompts without fine-tuning. Experimental results show that our model performs competitively on both text-only and multi-modal harmful datasets. Additionally, by providing an analysis of question prompting, we enable a white-box analysis of user inputs. We believe our method provides valuable insights for real-world LLM services in mitigating security risks associated with harmful prompts.
A Remaining Proofs from Section 4
Here we provide proofs for all the results in Section 4 that were excluded from the main paper. We prove important properties of our convex program, starting by recalling properties shown in [CSS19a]. The function f(X) is convex in X. Theorem A.2 The function f(X) is separable in each row, and we define the following notation to capture it. The above derivation satisfies the conditions of the lemma and we conclude the proof.
Understanding Model Calibration -- A gentle introduction and visual exploration of calibration and the expected calibration error (ECE)
To be considered reliable, a model must be calibrated so that its confidence in each decision closely reflects its true outcome. In this blogpost we'll take a look at the most commonly used definition for calibration and then dive into a frequently used evaluation measure for model calibration. We'll then cover some of the drawbacks of this measure and how these surfaced the need for additional notions of calibration, which require their own new evaluation measures. This post is not intended to be an in-depth dissection of all works on calibration, nor does it focus on how to calibrate models. Instead, it is meant to provide a gentle introduction to the different notions and their evaluation measures as well as to re-highlight some issues with a measure that is still widely used to evaluate calibration.
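As a rough illustration of the kind of evaluation measure the post refers to, the expected calibration error is commonly computed by grouping predictions into confidence bins and averaging the gap between accuracy and mean confidence in each bin, weighted by bin size. The sketch below assumes equal-width bins over top-label confidences; the function name, the binning convention, and the sample values are illustrative and are not taken from the post.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin size / n) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen label, each in [0, 1]
    correct:     booleans, True where the chosen label was right
    Assumption: equal-width bins [k/n_bins, (k+1)/n_bins), with 1.0 placed
    in the last bin. Other binning conventions give slightly different values.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical predictions: four at 0.9 confidence (three right),
# two at 0.6 confidence (one right).
example_confidences = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
example_correct = [True, True, True, False, True, False]
example_ece = expected_calibration_error(example_confidences, example_correct)
```

A value of 0 means confidence and accuracy match within every bin. Because the result depends on the number and placement of bins, it is best read alongside the binning choice that produced it.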
PDDLFuse: A Tool for Generating Diverse Planning Domains
Khandelwal, Vedant, Sheth, Amit, Agostinelli, Forest
Various real-world challenges require planning algorithms that can adapt to a broad range of domains. Traditionally, the creation of planning domains has relied heavily on human implementation, which limits the scale and diversity of available domains. While recent advancements have leveraged generative AI technologies such as large language models (LLMs) for domain creation, these efforts have predominantly focused on translating existing domains from natural language descriptions rather than generating novel ones. In contrast, the concept of domain randomization, which has been highly effective in reinforcement learning, enhances performance and generalizability by training on a diverse array of randomized new domains. Inspired by this success, our tool, PDDLFuse, aims to bridge this gap in Planning Domain Definition Language (PDDL). PDDLFuse is designed to generate new, diverse planning domains that can be used to validate new planners or test foundational planning models. We have developed methods to adjust the domain generator's parameters to modulate the difficulty of the domains it generates. This adaptability is crucial as existing domain-independent planners often struggle with more complex problems. Initial tests indicate that PDDLFuse efficiently creates intricate and varied domains, representing a significant advancement over traditional domain generation methods and making a contribution towards planning research.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)
QUITE: Quantifying Uncertainty in Natural Language Text in Bayesian Reasoning Scenarios
Schrader, Timo Pierre, Lange, Lukas, Razniewski, Simon, Friedrich, Annemarie
Reasoning is key to many decision-making processes. It requires consolidating a set of rule-like premises that are often associated with degrees of uncertainty and observations to draw conclusions. In this work, we address both the case where premises are specified as numeric probabilistic rules and situations in which humans state their estimates using words expressing degrees of certainty. Existing probabilistic reasoning datasets simplify the task, e.g., by requiring the model to only rank textual alternatives, by including only binary random variables, or by making use of a limited set of templates that result in less varied text. In this work, we present QUITE, a question answering dataset of real-world Bayesian reasoning scenarios with categorical random variables and complex relationships. QUITE provides high-quality natural language verbalizations of premises together with evidence statements and expects the answer to a question in the form of an estimated probability. We conduct an extensive set of experiments, finding that logic-based models outperform out-of-the-box large language models on all reasoning types (causal, evidential, and explaining-away). Our results provide evidence that neuro-symbolic models are a promising direction for improving complex reasoning. We release QUITE and code for training and experiments on GitHub.
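To make the explaining-away reasoning type concrete, here is a minimal sketch that computes posteriors by brute-force enumeration over a tiny network with two independent causes and one shared effect. The network, the variable names, and all probabilities are hypothetical and are not drawn from QUITE; the example also uses binary variables for brevity, whereas the dataset uses categorical ones.

```python
from itertools import product

# Hypothetical priors for two independent causes A and B.
P_A = 0.1
P_B = 0.1
# Hypothetical conditional table: P(E = True | A, B).
P_E_GIVEN = {
    (True, True): 0.95,
    (True, False): 0.8,
    (False, True): 0.8,
    (False, False): 0.01,
}


def joint(a, b, e):
    """Joint probability P(A=a, B=b, E=e) for the network A -> E <- B."""
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pe = P_E_GIVEN[(a, b)] if e else 1 - P_E_GIVEN[(a, b)]
    return pa * pb * pe


def posterior_a(evidence):
    """P(A = True | evidence); evidence maps 'b' and/or 'e' to booleans."""
    numerator = 0.0
    denominator = 0.0
    for a, b, e in product([True, False], repeat=3):
        world = {"a": a, "b": b, "e": e}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # this assignment contradicts the evidence
        p = joint(a, b, e)
        denominator += p
        if a:
            numerator += p
    return numerator / denominator


# Evidential reasoning: observing the effect raises belief in cause A.
p_a_given_e = posterior_a({"e": True})
# Explaining away: also observing cause B lowers belief in A again.
p_a_given_e_and_b = posterior_a({"e": True, "b": True})
```

With these made-up numbers, observing the effect lifts the belief in A well above its prior, and learning that B is present pulls it most of the way back down, which is the pattern the explaining-away category tests.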
- Asia > Singapore (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- Health & Medicine (0.68)
- Automobiles & Trucks (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Probabilistic Classification of Near-Surface Shallow-Water Sediments using A Portable Free-Fall Penetrometer
Rahman, Md Rejwanur, Rodriguez-Marek, Adrian, Stark, Nina, Massey, Grace, Friedrichs, Carl, Dorgan, Kelly M.
The geotechnical evaluation of seabed sediments is important for engineering projects and naval applications, offering valuable insights into sediment properties, behavior, and strength. Obtaining high-quality seabed samples can be a challenging task, making in-situ testing an essential part of site characterization. Free Fall Penetrometers (FFP) have emerged as robust tools for rapidly profiling seabed surface sediments, even in energetic nearshore or estuarine conditions and at shallow as well as deep depths. While methods for interpretation of traditional offshore Cone Penetration Testing (CPT) data are well-established, their adaptation to FFP data is still an area of research. In this study, we introduce an innovative approach that utilizes machine learning algorithms to create a sediment behavior classification system based on portable free fall penetrometer (PFFP) data. The proposed model leverages PFFP measurements obtained from locations such as Sequim Bay (Washington), the Potomac River, and the York River (Virginia). The results show 91.1% accuracy in the class prediction, with the classes representing cohesionless sediment with little to no plasticity, cohesionless sediment with some plasticity, cohesive sediment with low plasticity, and cohesive sediment with high plasticity. The model not only provides the predicted class but also yields an estimate of the inherent uncertainty associated with the prediction, which can provide valuable insight about different sediment behaviors. These uncertainties typically range from very low to very high, with lower uncertainties being more common, but they can increase significantly depending on variations in sediment composition, environmental conditions, and operational techniques. By quantifying uncertainty, the model offers a more comprehensive and informed approach to sediment classification.
- North America > United States > Virginia (0.35)
- North America > United States > Florida > Alachua County > Gainesville (0.14)
ROME: Memorization Insights from Text, Logits and Representation
Li, Bo, Zhao, Qinghua, Wen, Lijie
Previous works have evaluated memorization by comparing model outputs with training corpora, examining how factors such as data duplication, model size, and prompt length influence memorization. However, analyzing these extensive training corpora is highly time-consuming. To address this challenge, this paper proposes an innovative approach named ROME that bypasses direct processing of the training data. Specifically, we select datasets categorized into three distinct types -- context-independent, conventional, and factual -- and redefine memorization as the ability to produce correct answers under these conditions. Our analysis then focuses on disparities between memorized and non-memorized samples by examining the logits and representations of generated texts. Experimental findings reveal that longer words are less likely to be memorized, higher confidence correlates with greater memorization, and representations of the same concepts are more similar across different contexts. Our code and data will be publicly available when the paper is accepted.
- Asia > China > Beijing > Beijing (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)